Weighted frequency warping for voice conversion

نویسندگان

Daniel Erro

Asunción Moreno

چکیده

This paper presents a new voice conversion method called Weighted Frequency Warping (WFW), which combines the well known GMM approach and the frequency warping approach. The harmonic plus stochastic model has been used to analyze, modify and synthesize the speech signal. Special phase manipulation procedures have been designed to allow the system to work in pitch-asynchronous mode. The experiments show that the proposed technique reaches a high degree of similarity between the converted and target speakers, and the naturalness and quality of the resynthesized speech is much higher than those of classical GMM-based systems.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article aims to examine methods of optimizing GMM-based voice conversion systems performance in which GMM method is introduced as the basic method for improvement of voice conversion systems performance. In the current methods, due to using a single conversion function to convert all speech units and subsequent spectral smoothing arising from statistical averaging, we will observe quality ...

متن کامل

不需平行語料而基於共振峰與線頻譜頻率映對之語者特質轉換系統 (A Voice Conversion System based on Formant and LSF Mapping without Using Parallel Corpus) [In Chinese]

Voice conversion has been used in many applications. The methods based on vector quantization codebook and Gaussian mixture models need dynamic time warping on parallel sentence corpus for generating mapping functions. Recent study tries to use less training data, and even without parallel sentence corpus. This paper presents a voice conversion method without using parallel sentence corpus. It ...

متن کامل

A flexible and modular crosslingual voice conversion system

A cross-lingual voice conversion system aims at modifying the timbral structure of recorded sentences from a source speaker, in order to obtain processed sentences which are perceived as the same sentences uttered by a target speaker. This work presents the cross-lingual voice conversion problem as a network of related sub-problems and discuss several techniques for solving each of these sub-pr...

متن کامل

Reducing one-to-many problem in Voice Conversion by equalizing the formant locations using dynamic frequency warping

In this study, we investigate a solution to reduce the effect of oneto-many problem in voice conversion. One-to-many problem in VC happens when two very similar speech segments in source speaker have corresponding speech segments in target speaker that are not similar to each other. As a result, the mapper function usually oversmoothes the generated features in order to be similar to both targe...

متن کامل

Voice Conversion through Residual Warping in a Sparse, Anchor-based Representation of Speech

In previous work we presented a Sparse, Anchor-Based Representation of speech (SABR) that uses phonemic “anchors” to represent an utterance with a set of sparse non-negative weights. SABR is speaker-independent: combining weights from a source speaker with anchors from a target speaker can be used for voice conversion. Here, we present an extension of the original SABR that significantly improv...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2007

Weighted frequency warping for voice conversion

نویسندگان

چکیده

منابع مشابه

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

不需平行語料而基於共振峰與線頻譜頻率映對之語者特質轉換系統 (A Voice Conversion System based on Formant and LSF Mapping without Using Parallel Corpus) [In Chinese]

A flexible and modular crosslingual voice conversion system

Reducing one-to-many problem in Voice Conversion by equalizing the formant locations using dynamic frequency warping

Voice Conversion through Residual Warping in a Sparse, Anchor-based Representation of Speech

عنوان ژورنال:

اشتراک گذاری